Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dynamic filename generation #3222

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

Filippo125
Copy link

@Filippo125 Filippo125 commented Jul 9, 2024

Pre-Request Checklist

  • Passes rubocop code analysis (try rubocop --auto-correct)
  • Tests added or adapted (try rake test)
  • Changes are reflected in the documentation
  • User-visible changes appended to CHANGELOG.md

Description

Add support for dynamic file name based on the attribute present in the node object
Add support for time date in the filename. See PR1462

@ytti
Copy link
Owner

ytti commented Jul 9, 2024

I had toyed around with this in master...filename - it has bit more flexibility into how to name the file, but no fetch method.

The fecth method is tricky with arbitrary filename.

@Filippo125
Copy link
Author

You are right.
I'm thinking how handle the fetch method properly.. but I don't understand very well how I aim for it.
My idea is to implement the same logic expect for time variable and return the most recent file. What do you think about it?

@ytti
Copy link
Owner

ytti commented Jul 9, 2024

You are right. I'm thinking how handle the fetch method properly.. but I don't understand very well how I aim for it. My idea is to implement the same logic expect for time variable and return the most recent file. What do you think about it?

I can't think of better solution, than that, if we mean mtime of file, and assume filesystem has that. I guess we could instead of mtime just use the time presented in name, but then we couldn't allow users to ever change formatting, without breaking it.

@Filippo125
Copy link
Author

@ytti I think i have done

@Filippo125 Filippo125 marked this pull request as ready for review July 18, 2024 09:41
@ytti
Copy link
Owner

ytti commented Jul 19, 2024

I am bit on the fence, why this is the right approached compared to my approach, which affects all outputs. Why wouldn't there be anyone who would want to emit like NOS/device? Instead of device, as file name to git?

@Filippo125
Copy link
Author

You might be right, but I think that providing the same functionality on other outputs require more care on the filename format validation.
From my point of view to goes in that direction, it is better to review the fetch and store signatures to uniform more and to be more flexyble. For example, why does store not receive the entire node, but only some fields?
But if you prefer to move the naming function outside the file.rb, i'll do it.

@ytti
Copy link
Owner

ytti commented Jul 19, 2024

Yeah I'm not sure #fetch is going to be reliable anyhow. I'm also skeptical if you can flatten the opts safely. And I don't think your version supports passing directory as part of filename, and I think my version supports that. So maybe you want yyyy/mm/filename, idk.

Generally, I don't have too good feeling about convergence of feature being needed and quality of implementation. It's not very important feature, I think, therefore it probably should be very well implemented to justify complexity.

@ytti
Copy link
Owner

ytti commented Jul 28, 2024

On top of my branch changes, I was thinking maybe 'fetch' can work, because we store the name. So we don't have to try to infer it?

Like so:

diff --git a/lib/oxidized/output/file.rb b/lib/oxidized/output/file.rb
index 70a8cd3..40c67d9 100644
--- a/lib/oxidized/output/file.rb
+++ b/lib/oxidized/output/file.rb
@@ -29,7 +29,7 @@ module Oxidized

     def fetch node, group
       cfg_dir   = File.expand_path @cfg.directory
-      node_name = name(node)
+      node_name = node.output.name || node.name

       if group # group is explicitly defined by user
         cfg_dir = File.join File.dirname(cfg_dir), group
diff --git a/lib/oxidized/output/output.rb b/lib/oxidized/output/output.rb
index a120373..cb8fb65 100644
--- a/lib/oxidized/output/output.rb
+++ b/lib/oxidized/output/output.rb
@@ -1,6 +1,7 @@
 module Oxidized
   class Output
     class NoConfig < OxidizedError; end
+    attr_accessor :name

     def cfg_to_str cfg
       cfg.select { |h| h[:type] == 'cfg' }.map { |h| h[:data] }.join
@@ -22,7 +23,7 @@ module Oxidized
         minute: "%02d" % time.min,
         second: "%02d" % time.sec,
       }
-      name_str % map
+      @name = name_str % map
     end
   end
 end

I'm not sure I like node_name = node.output.name || node.name, I do wonder if I rather want to use node.output.name explicitly, and guarantee it is always populated either with node.name or via the user configured string template.

@robertcheramy
Copy link
Collaborator

I'm not sure how to deal with this PR. My impression is that the discussion between @ytti and @Filippo125 is not finished.

If this PR has to be merged into master, I need first:

  • a positive feedback from @ytti (=> go for it ;-)
  • unit tests that validate the code and the different cases, including malicious ones
  • a code that passes rake test (rubocop + unit tests)

When these points have been completed, I could make a review of the code. I've got the feeling that node.email is the email from the commit and that fetch won't work for commits triggered by oxidized-web (/next).

I'm sorry being somewhat overcautious here, the PR introduces complexity and I don't understand its benefits yet - if I need history, the git output seems a better alternative to me.

@Filippo125
Copy link
Author

I'm testing the solution on real field at the moment.
The idea behind this PR is not only the history of the changes, but customize the filename thanks to properties of the device.
If this pr is to complex to maintains, you can close this and i create a new one only to uniform the params passed to the function in store and fetch.

@robertcheramy
Copy link
Collaborator

I'm less concerned about the complexity of the code than about de potential security implications of the PR and potential corner cases not working (fetch).

It may be better to create a dedicated output for it (customfile.rb or something like that).

It would help me to understand your use case. How does your output.file.filename looks like?

@Filippo125
Copy link
Author

Filippo125 commented Oct 30, 2024

I try to explain my use case better.
I use a http source to define a list of devices, including details such as name, IP, model, username, password, as well as additional variables (uuid and location).
In the example source for csv:

source:
  default: csv
  csv:
    file: "/workspaces/oxidized/local_dev/device.csv"
    delimiter: !ruby/regexp /:/
    map:
      name: 0
      ip: 1
      model: 2
      username: 3
      password: 4
    vars_map:
      uuid: 5
      location: 6

In the backup file i would to customise at least the filename output using something coming from the vars_map. In the best option also the directory (but for my use case, it is not mandatory).
I would modify the file output because i see in the past that someone ask also for the date/time field in the name.
As i said in the previous comment, maybe it is better to change and uniform only the firm of the methods to pass all the node variables and i write a custom output to handle my use case.

PS: The same should be applied also in the hook, especially for exec hook.

@robertcheramy
Copy link
Collaborator

Thank you for the explanation.

I have a similar use-case. I passed the attributes city and street as the group "city/street", without modifying the filename, wich produces a nice tree and paths like Stuttgart\Arnulf-Klett-Platz 42\R-1234.domain
Using custom attributes as a filename (or in my case folder) produces interesting errors when you have custom chars that your filesystem does not like.

I'm still weighing the pros and cons of this PR (or the option you propose - just changing the methods and letting the custom output out of oxidized). I need some more time for this, I'll get back to you after it.

@robertcheramy
Copy link
Collaborator

robertcheramy commented Nov 6, 2024

Okay, so after some reflexion, I think this PR should be changed in following points:

  • a new output, for example CustomFile or DynamicFile, should be created, the File output should not be changed
  • The parameters used in the filename must go through a positive list and every not allowed char has to be dropped (gsub(/[^\w.-]/,'')) or replaced by _ (gsub(/[^\w.-]/,'_')). I think that replacing with _ produces a somewhat nicer output, but this is a personal opinion - both are good solutions.
  • The documentation of the customfile [or dynamicfile] output (in Outputs.md) should state that fetch may not work when using attributes that can be changed externally, that chars not in the positive list will be stripped out [or replaced with _, see above] and that it may be a security risk to use attributes that may be changed externally (email of commit message)
  • new unit tests that validate the code and the different cases, including malicious ones are needed
  • the code must passe rake test (rubocop + unit tests)

Please also note that another PR has been merged into main, introducing a Namespace Oxidized::Output for outputs. This will impact your code. Sorry for the collision 😅.

I've understood the use case, and I'd like to thank you for the effort of writing this PR, including attributes you don't need yourself (date). I am aware that the points above mean more work, I hope that they do not seem exaggerated to you. If you need some help, especially the unit tests can be time consuming, just let me know. I have write access to the PR so I can directly contribute code into it.

@ytti
Copy link
Owner

ytti commented Nov 6, 2024

I'm not really sure I agree it should be another output model. I agree we have some icky problems, but I don't see how another output class solves any of those icky problems.

The implementation my branch proposes hardly even cares about output, and it can be used for any output at all, if we want to manipulate the name somehow. And it does nothing in any output, if the variable is not set. I also feel it is vastly simpler and more maintainable approach than the proposed approach.

I do feel like mangled output name can be useful regardles of output class. But I'm not sure it's useful enough to justify the cost.

@Filippo125
Copy link
Author

So, maybe the best option at the moment is close these PR. I would like to open a new one only to change the parameters passed in the method to give us the flexibility to use all the fields in the object. Let me know

@robertcheramy
Copy link
Collaborator

It seems to me that we currently cannot find a common position on this PR. I understand the point of @ytti, but it does not support node variables, and I have other currently priorities than investing time into it.

So your proposition to close the PR and only changing the parameters passed in output.store seem the most reasonable option to me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants